-
- HTMLCon Version 1.8 (June, 1995)
- An HTM(L) to ASCII Document Converter
-
-
- Satore Township
- P.O. Box 750836
- Petaluma, CA 94975-0836
-
- WWW to http://www.crl.com/~mikekell/index.html
- FTP to ftp.crl.com/ftp/users/ro/mikekell/ftp
-
- This program may be distributed freely as long as no
- modifications are made to it or this documentation. We
- ask that you register this program if you find it useful.
- The registration fee of $7.00 (U.S., by check) should be
- mailed to Satore Township at the address given above. If
- you register this program and provide us with your e-mail
- address, we will provide you with the command to eliminate
- the registration request screen which appears when the
- program is initiated.
-
- E-mail to mikekell@crl.com for comments or suggestions.
-
-
- About the Program
- -----------------
-
- HTMLCon converts HTML/HTM files to standard ASCII files, making them ready
- for viewing, editing or printing with standard DOS, OS/2 or Windows tools.
- HTMLCon operates under MSDOS or under any program capable of providing an
- MSDOS session and using COMMAND.COM as a command interpreter. After
- processing the input document, output will be displayed on a viewer or
- editor of your choice, or printed if you choose.
-
- HTMLCon recognizes HTML markup through the HTML+ level as of this date.
- It will automatically detect HTML files created in either an MSDOS or
- UNIX environment and process them correctly. HTMLCon will attempt to
- process the raw HTML file such that the output is as readable as
- possible, eliminating unfavorable formatting to every extent practical.
-
- A variety of options are available as defined in the control file
- (HTMLCON.INI). The control file is necessary for the proper operation
- of HTMLCon. This file may be modified with any text editor and is
- heavily commented to allow you to set various options.
-
-
- Installation
- ------------
-
- Copy HTMLCON.EXE and HTMLCON.INI to a new directory of your choice.
- Now set the environment variable "HTMLCON" to point to the directory
- where HTMLCON.INI resides. This will allow you to run the program
- from any location on your system. For example, if you put HTMLCON.EXE
- and HTMLCON.INI in the directory C:\UTILS, use the following command
- in your AUTOEXEC.BAT file:
-
- SET HTMLCON=C:\UTILS
-
- Note that the HTMLCON environment variable should not end with a trailing
- backslash. If HTMLCon is unable to locate the HTMLCON.INI file, it will
- advise of the problem, wait thirty seconds, then proceed with processing
- the files you have selected using default values. In that case none of
- the important directives in the HTMLCON.INI file will be used.
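-
- The lookup described above can be sketched in Python. This is only an
- illustration and not HTMLCon's actual code: the function names are
- hypothetical, and the idea that a trailing backslash causes trouble
- through plain string concatenation is an assumption, not something this
- documentation states.

```python
import ntpath  # DOS-style path handling that behaves the same on any host OS

CONTROL_FILE = "HTMLCON.INI"


def control_file_path(environ):
    """Return the expected control file path, or None if HTMLCON is unset."""
    base = environ.get("HTMLCON", "")
    if not base:
        return None  # the caller would fall back to built-in default values
    return ntpath.join(base, CONTROL_FILE)


def naive_control_file_path(environ):
    """Hypothetical naive lookup that simply concatenates strings."""
    return environ["HTMLCON"] + "\\" + CONTROL_FILE


if __name__ == "__main__":
    print(control_file_path({"HTMLCON": "C:\\UTILS"}))
    # A trailing backslash yields a doubled separator with the naive version.
    print(naive_control_file_path({"HTMLCON": "C:\\UTILS\\"}))
```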
-
- The program is now ready to run. Source files may be located in any
- directory. Output files will be created in the directory from which
- HTMLCon was run. If you are using the optional filter file (HTMLCON.FIL),
- it should be located in the same directory as HTMLCON.EXE and HTMLCON.INI.
-
-
-
- Operation
- ---------
-
- HTMLCon can be operated in the interactive mode by running "HTMLCon"
- from the MSDOS session. It can also be run without operator
- intervention by using the following command line arguments:
-
- HTMLCon input_file[.html] line_length output_file[.ASC], or
- HTMLCon input_file[.html] output_file[.ASC], or
- HTMLCon input_file[.html]
-
- where "line_length" indicates where HTMLCon should try to break a line
- for the output file, using values between 40 and 200 characters per
- line. Preferences can be stated in HTMLCON.INI as shown below. The
- default file extensions can be overridden on the command line for both
- input and output files (as well as in the HTMLCON.INI file).
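-
- The three command line forms can be told apart by the number of
- arguments. The Python sketch below illustrates that idea; it is not
- HTMLCon's own parser. The function names are hypothetical, and deriving
- the output name from the input file's base name when no output file is
- given is an assumption. The default length of 65 is the default that the
- sample control file documents for the linebreak directive.

```python
import ntpath  # DOS-style path handling, independent of the host OS

MIN_LEN, MAX_LEN = 40, 200  # range stated for line_length


def with_default_ext(name, ext):
    """Append ext only when the name carries no extension of its own."""
    return name if ntpath.splitext(name)[1] else name + ext


def parse_args(argv, default_len=65, in_ext=".html", out_ext=".ASC"):
    """Return (input_file, line_length, output_file) for 1 to 3 arguments."""
    if len(argv) == 3:
        source, length_text, target = argv
        length = int(length_text)  # raises ValueError if not a number
    elif len(argv) == 2:
        source, target = argv
        length = default_len
    elif len(argv) == 1:
        source = argv[0]
        # Assumption: output is named after the input and written to the
        # current directory, so any directory part is dropped.
        target = ntpath.splitext(ntpath.basename(source))[0]
        length = default_len
    else:
        raise ValueError("expected one, two or three arguments")
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError("line_length must be between 40 and 200")
    return (with_default_ext(source, in_ext), length,
            with_default_ext(target, out_ext))
```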
-
- HTMLCon has the ability to process multiple input files. When used
- in this mode HTMLCon will automatically assign the file extension '.ASC'
- to all output files unless the default file extension has been changed
- in the HTMLCON.INI file. HTMLCon will automatically detect the multiple file
- input mode by the presence of a '*' or '?' in the input file name.
-
- For example, suppose that HTMLCon resides in the directory "C:\HTMLCON"
- and that there are several HTM/HTML files in the directory "C:\HTMLWRIT"
- that you wish to process. First, move to the "C:\HTMLCON" directory,
- then issue the command "HTMLCON C:\HTMLWRIT\*.html". HTMLCon will
- process the files, one-by-one, asking you each time if you wish to
- proceed with processing the next file. When asked if you wish to
- proceed, you will be given the following options: Y)es (the default), N)o
- (no to this file only), Q)uit (quit processing all files), or A)ll
- (process all of the remaining files without pausing).
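-
- The multiple file mode and its prompt can be modeled roughly as follows.
- This Python sketch is an illustration under stated assumptions, not
- HTMLCon's implementation: the function names are hypothetical, Python's
- fnmatch only approximates DOS wildcard matching, and the sketch assumes
- the prompt appears before every file, including the first.

```python
import fnmatch


def is_multi_file(name):
    """Multiple file mode is signaled by '*' or '?' in the input name."""
    return "*" in name or "?" in name


def select_files(names, pattern, answers):
    """Return the names that would be processed, given prompt answers.

    names   -- candidate file names in the source directory
    pattern -- wildcard pattern such as "*.html"
    answers -- replies to the prompt; an empty reply means Y)es
    """
    chosen = []
    process_all = False
    replies = iter(answers)
    for name in names:
        # DOS file names are case-insensitive, so compare in upper case.
        if not fnmatch.fnmatchcase(name.upper(), pattern.upper()):
            continue
        if not process_all:
            reply = next(replies, "").strip().upper()[:1] or "Y"
            if reply == "Q":      # quit processing all files
                break
            if reply == "N":      # skip this file only
                continue
            if reply == "A":      # process the rest without pausing
                process_all = True
        chosen.append(name)
    return chosen
```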
-
- HTMLCon also has the ability to print processed files. By placing the
- following line in the HTMLCON.INI file you are able to activate printing
- capabilities:
-
- useprinter=yes
-
- This directive tells HTMLCon to ask, after each file is processed, whether
- the output should be sent to LPT1. You may respond Y)es or N)o to the
- query (the default is Yes). If this line does not appear in the
- HTMLCON.INI file, HTMLCon will not ask about printing files after they
- are processed. Please note that HTMLCon prints only to LPT1 and applies
- no other processing to the output file. If you use this option, HTMLCon
- assumes that you have a printer connected to LPT1 and that the printer
- is working properly.
-
- Images found in the HTM file are output as [IMAGE], and HREF references
- as [*]. Forms are noted and marked, as are preformatted text and other
- special HTML symbols. Derivatives are ignored except when the text is
- preformatted or when the special HTMLCON.FIL file is used.
-
- HTMLCon can make use of a special filter file (HTMLCON.FIL in the
- default directory) in order to translate HTML ENTITIES of the user's
- choice. Use of this filter is activated by the statement
- "usefilter=yes" in the HTMLCON.INI file (see below). The user may
- define up to 300 such filters in the HTMLCON.FIL file. See the
- sample HTMLCON.FIL file for further details. This is an advanced
- feature and is not necessary for non-demanding HTMLCon use.
-
- Since the HTML language is evolving continuously, it is possible that
- HTMLCon may not recognize certain symbols properly. Also, since there
- is great variation in the creation of HTML documents, it may not be
- possible to ideally format all output. Problems with the output will be
- corrected in future versions and we ask that you let us know of any
- problems by sending us e-mail, including the original HTML document that
- is not being processed correctly.
-
-
- HTMLCon Control File
- --------------------
-
- The control file should be named HTMLCON.INI and exist in the same
- directory as HTMLCon. Here is a sample, with explanations, of the
- control file:
-
- # HTMLCon Initialization File (current through version 1.8)
- # ---------------------------------------------------------
- #
- # ----- ABOUT THE HTMLCON.INI CONTROL FILE -----
- #
- # Lines beginning with a pound sign are considered comments.
- # All other lines are considered instructions and must exactly follow
- # the format described in this sample file. Arguments are separated
- # by an equal sign (=) which must not be preceded or followed by
- # a space or tab.
- #
- #
- # ----- DEFINING THE OUTPUT LINE LENGTH -----
- #
- # Define the default point at which HTMLCon should attempt to break a
- # line for the output file. The break is not guaranteed to occur at
- # this point, but as close to it as possible to retain the syntax of
- # the input line. Default=65.
- #
- linebreak=75
- #
- #
- # ----- COLLECTING STATISTICS -----
- #
- # Statistics can be compiled and written to the output file. Default=No.
- # Use of this function does not increase the processing time and it does
- # provide some interesting information in the output file.
- #
- statistics=yes
- #
- #
- # ----- VIEWING OR PROCESSING THE OUTPUT FILE AUTOMATICALLY -----
- #
- # You may launch another program after HTMLCon finishes its work. This
- # may be an ASCII file viewer, editor, or whatever. The launched program
- # must be able to take the output file name as an argument. In order to
- # accomplish this you must provide the FULL PATH to your program. This
- # is a handy function to allow you to automatically and immediately see
- # the results of the HTMLCon conversion process.
- #
- #launchprog=c:\utils\list.com
- #
- #
- # ----- FINDING AND REPLACING THINGS -----
- #
- # Find and replace: you may specify up to 50 strings to be located in
- # the HTML file and replaced in the ASCII output file. These will be a
- # direct replacement using the two commands "find=" and "replace=". Each
- # "find" element will be replaced by a "replace" element, therefore you
- # cannot have a "find=" statement without a following "replace=" statement.
- # To specify leading or ending spaces in a statement, surround the statement
- # with quotations ("). The strings cannot exceed 40 characters each.
- #
- find=" -- "
- replace=--
- #
- # Here is an example replacing all HTMLCon reference symbols [*] with just *.
- #
- #find=[*]
- #replace=*
- #
- # Or just ignore all references altogether...
- #
- #find=[*]
- #replace=
- #
- # And replace all HTMLCon image symbols [IMAGE] with a shorter one.
- #
- #find=[IMAGE]
- #replace=[I]
- #
- # Or just ignore them altogether...
- #
- #find=[IMAGE]
- #replace=
- #
- # And replace all HTMLCon list/tab markers with two spaces.
- #
- find=->
- replace="  "
- #
- # Or replace the list/tab markers with something else...
- #
- #find=->
- #replace=|
- #
- # Or just ignore them altogether...
- #
- #find=->
- #replace=
- #
- #
- # ----- KEEPING THE AUTHOR'S ORIGINAL FORMATTING -----
- #
- # You may elect to keep the formatting characteristics of the original
- # HTML file intact. This will preserve white spaces, line breaks, etc. as
- # originally constructed by the author of the HTML page. This option
- # will also eliminate the HTMLCon tab markers (->) and replace them with
- # four spaces to indicate tab lists. Uncomment the following line to
- # preserve the original formatting:
- #
- #keepformatting=yes
- #
- #
- # ----- IGNORING HTMLCON'S MARKERS IN THE OUTPUT FILE -----
- #
- # You may choose to have HTMLCon not replace certain HTML constructs
- # with its own markers (for example, HTMLCon replaces URL references
- # with the symbol [*]). To have HTMLCon simply ignore its own symbols and
- # not reference certain items in the original HTML file, uncomment the
- # next line:
- #
- #ignoresymbols=yes
- #
- #
- # ----- PRESERVING HREF MARKERS IN THE OUTPUT FILE -----
- #
- # You may instruct HTMLCon to preserve all <A HREF...> constructs when
- # converting the HTML file. These references will be preserved intact,
- # without modification. To use this feature, uncomment the next line:
- #
- #keephref=yes
- #
- #
- # ----- ELIMINATING ADVERTISEMENTS AND DELAYS -----
- #
- # Eliminate the advertisements and delays
- # [available to registered users only]
- #
- #
- # ----- PRINTING THE OUTPUT FILE ON LPT1 -----
- #
- # If you would like the option to send the processed file to LPT1
- # then uncomment the next line:
- #
- #useprinter=yes
- #
- # Note that you may only send the processed file to a line printer
- # attached to LPT1 and that HTMLCon assumes the printer is connected
- # and operating properly.
- #
- #
- # ----- SPEED PROCESSING MULTIPLE FILES -----
- #
- # Uncomment the following line to tell HTMLCon to NEVER pause for any
- # prompt, including the call to your file viewer or other
- # post-processor.
- #
- #nopause=yes
- #
- #
- # ----- IGNORING CERTAIN FILE TYPES -----
- #
- # The following directive lists file extensions which should always be
- # ignored by HTMLCon. If an input file name contains one of these
- # extensions then it will never be processed. Note that the file
- # extension must always include the "." in this directive:
- #
- ignore=.ZIP.EXE.COM.LZH.GIF.LPG.ARC.ASC.SYS.INI.TXT.DOC
- #
- #
- # ----- USING USER-DEFINED FILTERS -----
- #
- # Uncomment the next directive to have HTMLCon apply a set of filter
- # replacements contained in the file HTMLCON.FIL in HTMLCon's default
- # directory. This filter file will find and replace HTML ENTITIES
- # in your output file.
- #
- usefilter=yes
- #
- #
- # ----- CHANGING THE DEFAULT OUTPUT FILE NAME EXTENSION -----
- #
- # HTMLCon normally uses the default file extension ".ASC" when multiple
- # files are processed or the file extension is not specified. You may
- # specify your own default file extension using the following command.
- # This file extension MUST be preceded by a "." and contain no more than
- # three characters.
- #
- #extension=.TXT
- #
- #
- # End of file
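-
- The rules stated in the sample control file (comment lines, key=value
- with no space around the equal sign, paired find/replace statements,
- quoted strings, and the limits of 50 pairs and 40 characters) can be
- expressed as a small reader. The Python sketch below is an illustration
- only; it is not HTMLCon's parser. The function names are hypothetical,
- and applying the pairs in file order is an assumption.

```python
MAX_PAIRS = 50    # "up to 50 strings"
MAX_CHARS = 40    # "cannot exceed 40 characters each"


def unquote(value):
    """Strip one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def parse_control_file(text):
    """Return (options, pairs) parsed from control file text."""
    options, pairs, pending = {}, [], None
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        if not sep or key != key.strip() or value[:1] in (" ", "\t"):
            raise ValueError("malformed directive: " + line)
        key = key.lower()
        if key == "find":
            if pending is not None:
                raise ValueError("find= without a following replace=")
            pending = unquote(value)
        elif key == "replace":
            if pending is None:
                raise ValueError("replace= without a preceding find=")
            new = unquote(value)
            if len(pending) > MAX_CHARS or len(new) > MAX_CHARS:
                raise ValueError("string longer than 40 characters")
            pairs.append((pending, new))
            pending = None
        else:
            options[key] = value
    if pending is not None:
        raise ValueError("find= without a following replace=")
    if len(pairs) > MAX_PAIRS:
        raise ValueError("more than 50 find/replace pairs")
    return options, pairs


def apply_replacements(text, pairs):
    """Apply each pair in file order (the ordering is an assumption)."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text
```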
-
-